Use of Sense Marking for Improving WordNet Coverage

Authors

  • Neha Prabhugaonkar
  • Jyoti Pawar
Abstract

WordNet is a crucial resource that aids several Natural Language Processing (NLP) tasks. WordNet development for 18 Indian languages has been initiated in India by the IndoWordNet consortium using the expansion approach, with the Hindi WordNet developed by IIT Bombay as the source. After 20K synsets had been linked, it was decided that each language group should measure the coverage of its WordNet using the sense marker tool released by IIT Bombay. The sense marking activity mainly helped in validating the WordNet and improving its coverage. This paper presents the various effects that the sense marking activity had on the development of the Konkani WordNet.

Similar resources

WordNet―Wikipedia―Wiktionary: Construction of a Three-way Alignment

The coverage and quality of conceptual information contained in lexical semantic resources is crucial for many tasks in natural language processing. Automatic alignment of complementary resources is one way of improving this coverage and quality; however, past attempts have always been between pairs of specific resources. In this paper we establish some set-theoretic conventions for describing ...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Introduction to Tools for IndoWordNet and Word Sense Disambiguation

Lexically rich resources form the foundation of all NLP tasks. Maintaining the high quality of these resources is thus a high priority. In this paper we present the tools developed at IIT Bombay for the creation, enhancement and maintenance of WordNets, as well as those used for NLP tasks that use WordNets directly, such as Word Sense Disambiguation. The paper presents online an...

Some Challenges of Automated Annotation in A Multilingual Scenario

A key ingredient of today’s NLP scenario is annotation, and this paper discusses the challenges involved in one of the toughest annotation tasks: sense marking. A large amount of data needs to be sense marked accurately by human annotators in order to train machines to understand spoken languages. Sense marked corpora for various languages facilitate the task of Word Sense Disambig...

Concept Space Synset Manager Tool

The IndoWordNet Consortium consists of member institutions developing WordNets using the expansion approach. WordNets developed using the expansion approach are strongly influenced by the source language and may not reflect the richness of the target language (Walawalikar et al., 2010). The IndoWordNet community therefore decided to develop concepts which were specific to their respective...

Publication date: 2014